Universal lossless compression via multilevel pattern matching
Authors
Abstract
A universal lossless data compression code called the multilevel pattern matching code (MPM code) is introduced. In processing a finite-alphabet data string of length n, the MPM code operates at O(log log n) levels sequentially. At each level, the MPM code detects matching patterns in the input data string (substrings of the data appearing in two or more nonoverlapping positions). The matching patterns detected at each level are of a fixed length, which decreases by a constant factor from level to level until it becomes one at the final level. The MPM code represents information about the matching patterns at each level as a string of tokens, with each token string encoded by an arithmetic encoder. From the concatenated encoded token strings, the decoder can reconstruct the data string via several rounds of parallel substitutions. An O(1/log n) maximal redundancy/sample upper bound is established for the MPM code with respect to any class of finite-state sources of uniformly bounded complexity. We also show that the MPM code is of linear complexity in terms of time and space requirements. The results of some MPM code compression experiments are reported.
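The level-by-level token generation and the decoder's rounds of substitution described in the abstract can be illustrated with a small sketch. This is a simplified illustration under stated assumptions, not the paper's exact construction: the function names `mpm_tokens` and `mpm_decode` are hypothetical, the block length is halved at each level (constant factor 2), the input length must be a power of two, the token alphabet is simply "index of first appearance", and the arithmetic coding of the token strings is omitted. Because it starts from the whole string as a single block, it runs log2(n) levels rather than the O(log log n) levels stated in the abstract.

```python
def mpm_tokens(data):
    """Simplified multilevel pattern matching: return one token list per level.

    Assumption: len(data) is a positive power of two (hypothetical restriction
    made only to keep this sketch short).
    """
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("this sketch needs a length that is a power of two")
    levels = []
    patterns = [data]              # distinct blocks still to be described
    while len(patterns[0]) > 1:
        half = len(patterns[0]) // 2
        index = {}                 # sub-block -> token, in order of first appearance
        tokens = []
        for block in patterns:
            for part in (block[:half], block[half:]):
                if part not in index:
                    index[part] = len(index)   # new pattern gets the next token
                tokens.append(index[part])     # repeated pattern reuses its token
        levels.append(tokens)
        patterns = list(index)     # only distinct sub-blocks go to the next level
    levels.append(patterns)        # final level: the length-one patterns themselves
    return levels


def mpm_decode(levels):
    """Rebuild the string by substituting patterns level by level, bottom up."""
    patterns = levels[-1]
    for tokens in reversed(levels[:-1]):
        # Each consecutive pair of tokens names the two halves of one block.
        patterns = [patterns[tokens[k]] + patterns[tokens[k + 1]]
                    for k in range(0, len(tokens), 2)]
    return patterns[0]


if __name__ == "__main__":
    levels = mpm_tokens("abababab")
    print(levels)
    print(mpm_decode(levels))
```

On a highly repetitive input such as "abababab", each level contributes only two tokens, since a block that matches an earlier one is never split again. In the MPM code itself each token string would then be passed to an arithmetic encoder.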
Similar resources
Study On Universal Lossless Data Compression by using Context Dependence Multilevel Pattern Matching Grammar Transform
In this paper, the context dependence multilevel pattern matching (CDMPM for short) grammar transform is proposed; based on this grammar transform, a universal lossless data compression algorithm, the CDMPM code, is then developed. Moreover, it is proved that this algorithm's worst-case redundancy among all individual sequences of length n from a finite alphabet is upper bounded by C(1/log n) w...
Context-dependent multilevel pattern matching for lossless image compression
The performance of the multilevel pattern matching (MPM) code for lossless image compression is first analyzed. It is shown that the worst-case redundancy of the MPM code against all finite 2D context arithmetic codes is O(1/√(log n)), where n is the number of pixels in the image to be compressed. This result is in contrast to the redundancy of O(1/log n) in the case of 1D data and is caused by ...
Universal lossless data compression with side information by using a conditional MPM grammar transform
A grammar transform is a transformation that converts any data sequence to be compressed into a grammar from which the original data sequence can be fully reconstructed. In a grammar-based code, a data sequence is first converted into a grammar by a grammar transform and then losslessly encoded. Among several recently proposed grammar transforms is the multilevel pattern matching (MPM) grammar ...
Source coding, large deviations, and approximate pattern matching
In this review paper, we present a development of parts of rate-distortion theory and pattern-matching algorithms for lossy data compression, centered around a lossy version of the asymptotic equipartition property (AEP). This treatment closely parallels the corresponding development in lossless compression, a point of view that was advanced in an important paper of Wyner and Ziv in 1989. In th...
A universal predictor based on pattern matching
We consider a universal predictor based on pattern matching: given a sequence X_1, ..., X_n drawn from a stationary mixing source, it predicts the next symbol X_{n+1} based on selecting a context of X_{n+1}. The predictor, called the Sampled Pattern Matching (SPM), is a modification of the Ehrenfeucht–Mycielski pseudorandom generator algorithm. It predicts the value of the most frequent symbol appea...
Journal title:
- IEEE Trans. Information Theory
Volume 46
Publication year: 2000